Radiotherapy and Oncology
Elsevier BV
Preprints posted in the last 30 days, ranked by how well they match Radiotherapy and Oncology's content profile, based on 11 papers previously published here. The average preprint has a 0.12% match score for this journal, so anything above that is already an above-average fit.
Hu, K.; Shah, P.; Nguyen, M. C.; McCluskey, C.; Kane, A.; Ove, R.; Willey, C.; Katz, S.; Marathe, O.; Valentin, S.; Frustino, J.; Villa, A.; Spencer, S.; Holtzapfel, C.; Treister, N.; Lalla, R.
Purpose: This study evaluated the safety and effectiveness of an intraoral light-emitting diode (LED)-based photobiomodulation (PBM) device to reduce the incidence and severity of oral mucositis (OM) from intensity modulated radiation therapy (IMRT) for head and neck cancer (HNC). Methods: This randomized, double-blind, sham-controlled trial enrolled patients with HNC undergoing high-dose IMRT over 6-8 weeks, with or without concurrent chemotherapy. Participants received daily 10-minute PBM or sham treatments immediately before IMRT sessions. Assessments were conducted at baseline, daily and weekly during IMRT, and two weeks post-IMRT. Results: Eighty-five participants (42 PBM; 43 sham) were enrolled across 12 US sites. No device-related adverse events were observed, and 99.5% of initiated sessions were completed. In the intent-to-treat population, severe OM (WHO Grade ≥3) incidence was significantly lower with PBM across six weeks of IMRT (36.8% vs 57.1%; p = 0.046) and at two weeks post-treatment (10.8% vs 36.4%; p = 0.042). In the per-protocol population, the PBM arm reported significantly greater taste preservation (p = 0.034), lower increases in mouth/throat soreness (p = 0.029) and throat pain (p = 0.028) and needed fewer feeding tube placements (p = 0.073) than the control arm. Conclusion: Daily intraoral PBM therapy using an LED-based device was safe, well tolerated, and significantly reduced the incidence of severe OM and associated complications in HNC patients undergoing IMRT with or without concurrent chemotherapy. These findings align with guidelines recommending daily intraoral PBM therapy for preventing cancer therapy-related OM, a dose-limiting toxicity for which effective preventive interventions are needed. Trial Registration: ClinicalTrials.gov Registration Number NCT03972527. Registered on June 3, 2019.
Concise Summary: Daily intraoral PBM therapy using an LED-based device was safe, well tolerated, and significantly reduced the incidence of severe OM and associated complications in HNC patients undergoing IMRT with or without concurrent chemotherapy. These findings align with guidelines recommending daily intraoral PBM therapy for preventing cancer therapy-related OM, a dose-limiting toxicity for which effective preventive interventions are needed.
McCullum, L.; Ding, Y.; Fuller, C. D.; Taylor, B. A.
Background and Purpose: Magnetic resonance imaging (MRI) for radiation therapy treatment planning is currently being used in many anatomical sites to better visualize soft tissue landmarks, a technique known as MRI simulation. A core component of modern MRI simulation configurations is the use of external laser positioning systems (ELPS) to help set up the patient. Though necessary for accurate and reproducible patient setup, the ELPS, if left on during imaging, may degrade image quality by leaking electronic noise, to which MRI is sensitive. It is currently unknown whether this leakage of electronic noise may further affect quantitative values derived from clinically employed relaxometric, diffusion, and fat fraction sequences. Therefore, in this study, we aim to characterize the impact of MRI simulation lasers on general image quality and quantitative imaging accuracy. Materials and Methods: First, a cine acquisition was used to visualize the real-time changes in image signal-to-noise ratio (SNR) as the ELPS was switched from deactivated to activated. To validate this effect quantitatively, the SNR was measured using the American College of Radiology (ACR) recommended protocol in a homogeneous phantom with the integrated body, 18-channel UltraFlex small, 18-channel UltraFlex large, 32-channel spine, and 16-channel shoulder coils. Next, a geometric distortion algorithm was tested in two vendor-provided phantoms using the integrated body coil, and the ACR Large Phantom protocol was tested. Finally, a series of quantitative MRI scans was performed using a CaliberMRI Model 137 Mini Hybrid phantom to validate quantitative T1, T2, and ADC, while a Calimetrix PDFF-R2* phantom was used for quantitative PDFF and R2*. All scans were performed with the ELPS both deactivated and activated.
Results: Visible electronic noise artifacts were seen on the cine acquisition when using the integrated body coil with the ELPS activated, which led to a four-fold decrease in SNR using the ACR protocol. This SNR drop was not seen when using the remaining tested coils. ELPS activation negatively affected the automatic fiducial detection algorithm, leading to misidentification of fiducials that were identified perfectly with the ELPS deactivated. Degradation in image intensity uniformity, percent signal ghosting, and low contrast object detectability was seen during ACR Large Phantom testing using the 20-channel Head/Neck coil. Concordance across quantitative MRI values was similar with the ELPS deactivated and activated, while a consistent increase in standard deviation inside the ADC vials was seen when the ELPS was activated. Discussion: Activating the ELPS during imaging should be avoided because it can unnecessarily increase image noise. This is particularly true when conducting mandatory quality assurance testing for image quality and geometric distortion, which uses the integrated body coil, the coil most susceptible to ELPS-induced noise. Clear clinical guidelines should be implemented to make this issue known to the MRI technologists, physicists, and other relevant staff using an MRI with a supplementary ELPS for patient alignment.
Pickersgill, N. A.; Fletcher, S. A.; Aiken, N.; Assel, M. J.; Liso, N.; Reuter, V. E.; Vickers, A. J.; Ehdaie, B.; Fine, S. W.
Background and Objective: Risk stratification in localized prostate cancer relies primarily on Grade group (GG). In GG2-4 disease, risk assignment depends on the proportions of pattern 3 and pattern 4. We hypothesized that total pattern 4 length on biopsy would better predict oncologic outcome than GG, percent pattern 4, and multivariable models ("nomograms") based on clinical variables. Methods: We identified 2499 patients with GG2-4 prostate cancer on biopsy who underwent radical prostatectomy. Discrimination for predictors was calculated for adverse pathologic stage (seminal vesicle invasion or lymph node invasion) and biochemical recurrence (BCR). Key Findings and Limitations: Total pattern 4 length for the case demonstrated the highest discrimination for adverse pathologic stage in comparison with GG (AUC 0.779 vs 0.658; p<0.0001), percent pattern 4 (0.719), and a model including prostate-specific antigen level, clinical stage, GG, PI-RADS score, and number of positive cores (0.762). Results were similar for BCR, with total pattern 4 length outperforming GG (C-index 0.716 vs 0.662), percent pattern 4 (0.695), and the clinical model (0.699). Neither mm of pattern 3 nor the clinical model added discrimination to total mm of pattern 4. Conclusions and Clinical Implications: Total length of Gleason pattern 4 on biopsy best predicts oncologic outcome in GG2-4 prostate cancer. Other common clinicopathologic variables do not further aid discrimination. Further research is warranted to determine the optimal method for quantifying pattern 4 before incorporation into risk stratification algorithms.
What does the study add?: Patients with Grade group 2-4 prostate cancer constitute both the largest group and the one in which treatment decision-making is most difficult. For such patients, total length of Gleason pattern 4 on biopsy predicted oncologic outcomes better than Grade group or multivariable models including the standard predictors of stage, grade, PSA, PI-RADS and number of positive cores. Neither mm of pattern 3 nor the standard predictors add discrimination once total length of pattern 4 is known.
Patient Summary: Treatment decisions in prostate cancer are often determined by the ratio of pattern 4 to pattern 3 disease. We showed that, in GG2-4 disease, using the total amount of pattern 4 for the case better predicts risk and therefore provides a better basis for treatment decisions.
Take Home Message: In Grade group 2-4 prostate cancer, total Gleason pattern 4 length for the case is a stronger predictor of adverse pathologic stage and biochemical recurrence than Grade group and other standard clinical variables. Further research is warranted to determine the optimal method for quantifying pattern 4 before incorporation into risk stratification algorithms.
Aunan-Diop, J. S.; Friismose, A. I.; Yin, Z.; Hojo, E.; Krogh Pettersen, J.; Hjortdal Gronhoj, M.; Bonde Pedersen, C.; Mussmann, B.; Halle, B.; Poulsen, F. R.
Background: Conventional MRI cannot reliably distinguish radiation necrosis (RN) from recurrent metastasis after cranial radiotherapy, as both can show similar enhancement despite different biology. We tested whether these entities are mechanically non-equivalent in vivo and separable by MRE-derived viscoelastic metrics and perilesional interface-instability features. Methods: In a prospective, histopathology-anchored cohort, 11 post-radiotherapy enhancing lesions were classified as RN (n=3) or recurrent/progressive tumor (n=8). MRE was acquired at 3.0 T with single-frequency 60-Hz excitation to derive storage modulus (G'), loss modulus (G''), and complex shear modulus magnitude (|G*|). Co-primary endpoints were median tumor G' and |G*|, each tested one-sided (RN > tumor) with Holm correction across the two co-primary tests. Median tumor G'' was tested two-sided. A prespecified secondary 6-endpoint family (absolute and tumor/NAWM-normalized G', G'', and |G*|) was analyzed with Benjamini-Hochberg FDR control. Exploratory instability mapping in a 0-6 mm peritumoral shell generated interface-topology metrics, including convexity. Results: Absolute tumor-core medians were higher in RN than tumor for |G*| (1.79 vs 1.32 kPa; Cliff's δ = 0.67; q = 0.10), G' (1.62 vs 1.09 kPa; δ = 0.50; q = 0.14), and G'' (0.81 vs 0.46 kPa; δ = 0.75; q = 0.10). NAWM normalization improved separation: tumor/NAWM |G*| (2.26 vs 1.41; δ = 0.92; q = 0.04) and tumor/NAWM G'' (2.67 vs 0.87; δ = 1.00; q = 0.04) were FDR-significant. Convexity also differentiated RN from tumor (0.49 vs 0.36; δ = 1.00; MWU p = 0.01). Conclusions: Tumor/NAWM G'', tumor/NAWM |G*|, convexity, and tumor G'' emerged as the strongest candidate features, indicating that RN is mechanically harder and more dissipative than recurrent metastasis. Signal strength was high (Cliff's δ up to 1.00) but should be interpreted cautiously given sample size.
Exploratory analyses further suggest that instability mapping captures biologically relevant interface behavior. These findings support a mechanics-based RN-versus-recurrence framework and justify prespecified, preregistered external validation.
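For readers less familiar with the effect-size and multiplicity terms used in the abstract above, the following is a minimal Python sketch of Cliff's delta and the Benjamini-Hochberg adjustment. The function names and the sample values are illustrative assumptions; this is not the authors' code or data.

```python
def cliffs_delta(xs, ys):
    """Cliff's delta: P(x > y) - P(x < y) over all cross-group pairs."""
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    return (greater - less) / (len(xs) * len(ys))


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted values (q-values) in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices, ascending p
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


# Hypothetical stiffness values (kPa), invented for illustration only.
group_a = [1.9, 2.1, 2.4]
group_b = [1.0, 1.2, 1.3, 1.5]
print(cliffs_delta(group_a, group_b))  # complete separation gives 1.0
print(benjamini_hochberg([0.01, 0.04, 0.03]))
```

A delta of 1.00 means every value in one group exceeds every value in the other. With only three lesions in the smaller group, complete separation can arise in a small sample, which is consistent with the abstract's own call for cautious interpretation.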
Kohn, T. P.; Coady, P. J.; Oppenheimer, A. G.; Walia, A.; Hernadez, B. S.; Kohn, J. R.; Parikh, N.; Bazzi, M.; Stocks, B.; Khera, M.; Lipshultz, L. I.
Introduction: Non-obstructive azoospermia (NOA) represents the most severe form of male infertility. Current clinical tools have limited ability to predict sperm production or guide surgical sperm retrieval. Conventional B-mode ultrasound provides qualitative grayscale images and cannot characterize testicular microstructure relevant to spermatogenesis. Quantitative ultrasound (QUS) provides objective parameters from raw radiofrequency data, which quantitatively measure tissue heterogeneity. We hypothesize that men with spermatogenesis will have different QUS features compared to men without spermatogenesis (measured by total motile count, TMC, on semen analysis), with the goal of identifying imaging biomarkers for prognosis and intraoperative guidance. Methods: We prospectively analyzed men presenting for infertility evaluation who underwent high-frequency ultrasound imaging and semen analysis. Imaging was performed using a 36-MHz transducer with fixed acquisition parameters. Ninety-two QUS features were extracted from manually annotated testicular regions of interest, including Nakagami distribution parameters (m, ω, k), envelope statistics, and texture features. Univariate associations between each QUS feature and TMC were assessed using Spearman correlation with Bonferroni correction. Top-performing features were evaluated using logistic regression and receiver operating characteristic (ROC) analysis to discriminate sperm presence or absence (TMC>0 vs TMC=0). Results: Thirty-seven men (18 azoospermic, 19 with sperm present in the ejaculate) contributed 135 regions of interest. Seventeen of 92 QUS features significantly correlated with TMC after correction. The coefficient of variation of the Nakagami k-factor within the superficial testicular parenchyma (K_Zone1_Cv) demonstrated the strongest correlation (ρ=0.51, corrected p<0.001), suggesting that greater spatial heterogeneity in the superficial parenchyma was associated with higher sperm counts.
K_Zone1_Cv discriminated sperm presence with an AUC of 0.77 (95% CI 0.60-0.92), sensitivity 73.7%, and specificity 83.3%. QUS features with the highest univariate association were highly intercorrelated, suggesting a shared biological signal. Conclusion: Quantitative ultrasound-derived measures of testicular microstructure heterogeneity correlate with sperm production and demonstrate moderate discrimination of sperm presence. These findings suggest QUS may provide a non-invasive imaging biomarker of spermatogenesis. Study findings warrant further assessment and validation in male infertility for sperm retrieval prognosis and the potential for intra-operative surgical guidance.
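As background for the Nakagami parameters named in the abstract above, here is a minimal Python sketch of the standard moment-based estimators for the Nakagami shape (m) and scale (ω) of an envelope signal, plus a generic coefficient of variation. The abstract does not state how its features (including the k-factor and the zone definitions) were computed, so the function names and inputs below are assumptions for illustration only.

```python
from statistics import fmean, pstdev


def nakagami_moments(envelope):
    """Moment-based estimates of the Nakagami shape (m) and scale (omega)."""
    power = [r * r for r in envelope]           # squared envelope amplitude
    omega = fmean(power)                        # omega = E[R^2]
    variance = fmean([(p - omega) ** 2 for p in power])
    if variance == 0:
        raise ValueError("constant envelope: shape parameter is undefined")
    return omega ** 2 / variance, omega         # m = E[R^2]^2 / Var(R^2)


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    return pstdev(values) / fmean(values)


# Hypothetical envelope samples, invented for illustration only.
m, omega = nakagami_moments([0.8, 1.1, 1.4, 0.9, 1.2])
print(m, omega)
```

A coefficient of variation taken over per-window parameter estimates within a region is one simple way to express spatial heterogeneity as a single number, which is the general idea behind a feature such as K_Zone1_Cv.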
Anderson, O.; Hung, R.; Fisher, S.; Weir, A.; Voisey, J. P.
Radiogenomics enables the non-invasive characterisation of the genomic and molecular properties of tumours, with epidermal growth factor receptor (EGFR) mutations in non-small cell lung cancer (NSCLC) being one of the most investigated applications. In this study, we evaluate radiomics, contrastive learning, and convolutional deep learning approaches to predict the EGFR mutation status from chest Computed Tomography (CT) images using the TCIA Radiogenomics dataset (n=115). Our results, using 10-fold cross validation, demonstrate the capacity of imaging models to predict mutation status from CT data in a manner consistent with existing literature. Among the evaluated methods, models integrating radiomic with clinical features achieved the best performance, with an AUC of 0.790 and AUPRC of 0.517, outperforming both contrastive learning (AUC=0.787) and convolutional architectures (AUC=0.763). Beyond methodological comparisons, we discuss the challenges related to clinical translation. Specifically, we contrast radiogenomics with conventional tissue biopsies, and identify scenarios where radiogenomics might be useful, either independently or in conjunction with other existing diagnostic technologies. Together these findings evidence the potential utility of radiogenomics EGFR models and provide direct architecture comparisons on the same dataset.
Makani, A.
Medical oncology education faces a dual crisis: knowledge velocity that outpaces static curricula and large language model (LLM) risks--hallucination and automation bias--that threaten the fidelity of AI-assisted learning. We present Onco-Shikshak V7, an AI-native adaptive learning platform that addresses both challenges through a unified cognitive architecture grounded in learning science. The system replaces isolated educational modules with four authentic clinical workflows--Morning Report, Tumor Board, Clinic Day, and AI Textbook--each scaffolded by a nine-module pedagogy engine that integrates ACT-R activation dynamics (illness scripts), Item Response Theory (adaptive difficulty), the Free Spaced Repetition Scheduler (FSRS v4), Zone of Proximal Development (scaffolding), and metacognitive calibration training (Brier score). Six specialist AI agents--medical oncology, radiation oncology, surgical oncology, pathology, radiology, and oncology navigation--engage in multi-disciplinary deliberation with per-specialty retrieval-augmented generation (RAG) grounding across nine authoritative guideline sources including NCCN, ESMO, and ASTRO. The platform provides 18 clinical cases with decision trees across six cancer types, maps every interaction to 13 ACGME Hematology-Oncology milestones, and implements four closed-loop feedback mechanisms that connect session errors to targeted flashcards, weak domains to suggested cases, and all interactions to a persistent learner profile. Technical validation confirms algorithmic correctness across eight subsystems. To our knowledge, this is the first system to unify ACT-R, IRT, FSRS, ZPD, and metacognitive calibration in a single medical education platform. Formal learner evaluation via randomized controlled trial is planned.
Choi, H.; Bae, S.; Na, K. J.
Background: Although deep learning models have improved individual PET analysis, image processing and quantification tasks, end-to-end automation from raw DICOM to quantitative clinical reporting remains limited, particularly in heterogeneous real-world settings. Methods: As a proof-of-concept, an autonomous large language model (LLM)-orchestrated multi-tool agent for end-to-end PET/CT interpretation was developed. A reasoning-based text LLM selected appropriate series from raw DICOM, coordinated registration and SUV conversion, invoked segmentation and detection tools, generated maximum-intensity projections, called a vision-enabled LLM for interpretation, and synthesized structured draft reports. The system was retrospectively evaluated in 170 patients undergoing baseline FDG PET/CT for lung cancer staging, using expert reports as reference. Results: The agent successfully completed the full end-to-end workflow from raw DICOM selection to structured draft report generation without human intervention in all 170 examinations. Primary tumor detection achieved 100% sensitivity. For nodal involvement, sensitivity was 84.8% and specificity was 39.4%, whereas distant metastasis detection showed 70.2% sensitivity and 65.0% specificity. Discrepancy analysis of 58 nodal and 57 metastatic mismatch cases revealed systematic false-positive findings related to reactive or physiologic uptake and false-negative findings involving small-volume or anatomically atypical metastases. Conclusion: LLM-orchestrated PET/CT agents can enable workflow-level automation from raw DICOM to quantification and structured draft reporting under real-world conditions. Although primary tumor detection was highly reliable, nodal and metastatic assessment revealed systematic limitations, supporting a collaborative role with continued expert oversight in complex clinical scenarios.
Ding, T.; Zhang, X.; Yu, L.
Our previous studies identified three microRNAs (miR-92a-1-5p, miR-375 and miR-148a-3p) potentially associated with prostate cancer (PCa), particularly in advanced stages such as bone-metastatic PCa. To evaluate their clinical diagnostic utility, we isolated extracellular vesicles (EVs) from the plasma of patients with benign prostatic hyperplasia (BPH) and PCa (including localized and bone-metastatic disease). The absolute quantification of these three miRNAs within plasma EVs was performed using digital PCR. Results indicated that miR-148a-3p alone possessed a good ability to discriminate between PCa and BPH. Notably, a combined panel of all three miRNAs demonstrated improved diagnostic performance, achieving an area under the curve (AUC) of 0.736 for distinguishing PCa from BPH. These findings suggest that the plasma EV-derived miRNA panel (miR-92a-3p, miR-148a-3p, and miR-375-3p) holds promise as an auxiliary diagnostic biomarker for PCa and may aid in identifying bone metastasis.
Lagunas, A.; Chen, P.-J.; Bruns, T. M.; Gupta, P.
Objective: This study aimed to characterize the activation of lower urinary tract (LUT) targets in response to pudendal nerve stimulation (PNS) in awake human participants. Materials and Methods: In this single center study, recruited participants had an implanted pudendal neurostimulator for treatment of their symptoms including overactive bladder, incontinence, urinary retention, and/or pelvic pain. Participants came in for a modified urodynamic study where a multichannel manometry catheter was placed in the lower urinary tract alongside a dual sensor urodynamics catheter. The bladder was filled and after each participant expressed a strong desire to void, PNS was applied and LUT pressures were measured. Participants attempted voids with the catheters in place to characterize LUT behavior and voiding efficiency with and without stimulation. Results: The study consisted of 15 participants including 13 women. Across 133 total trials contractions were observed at the distal urethra 52 times (39%) and at the proximal urethra 46 times (35%). The maximum observed pressure change occurred significantly more often at the proximal urethra than the distal urethra (p = 0.007). There was a significantly higher maximum tolerable stimulation amplitude for low frequency stimulation (2-3.1 Hz) when compared to high frequency stimulation (30-33 Hz) (p = 0.041). In one participant there were four instances of stimulation driven bladder contractions with an average pressure change of 24.3 cmH2O (standard deviation = 10.5). There was not a significant difference in voiding efficiency or maximum flow rate with and without stimulation (p = 0.76 and p = 0.45, respectively). Conclusions: PNS can affect LUT pressures at tolerable stimulation amplitudes. The absence of an effect of PNS on voiding characteristics suggests a similar mechanism of action as sacral neuromodulation.
Tariq, M.; Ruffle, J. K.; Brothwell, M.; Mohinta, S.; Kosmin, M.; Fersht, N.; Brandner, S.; Nachev, P.; Hyare, H.
Background: Glioblastoma (GBM), isocitrate dehydrogenase-wildtype (IDH-wt) is characterised by diffuse infiltration, with progression often arising from perilesional tissue and occult white-matter damage. We investigated whether radiomics from the T2/FLAIR-defined oedema and the structural disconnectome improve prediction of progression-free survival (PFS). Methods: We retrospectively analysed 387 adults with newly diagnosed GBM, IDH-wt treated at a single tertiary centre (2005-2020). A deep-learning pipeline segmented enhancing tumour, non-enhancing tumour, and oedema on pre-operative MRI; lesion masks were propagated to normative tractography to derive disconnectome maps. 3-D shape radiomic features extracted for each segmented region underwent appropriate feature selection. Finally, 10 tumour and 9 oedema radiomics were combined with 6 clinical features to train 3 survival models (Random Survival Forest (RSF), XGBoost, Cox proportional hazards (CPH)) that were evaluated on a held-out 20% test set using Harrell's C-index, Kaplan-Meier risk stratification and time-dependent ROC curves. Results: The best performance was achieved by RSF using all clinical and radiomic features (C-index 0.665 vs 0.595 for clinical features only, p=0.088). Models including oedema radiomics outperformed those using tumour radiomics alone, and disconnectome features, derived from both tumour and oedema regions, were repeatedly selected among the top predictors across algorithms. Combining radiomic and clinical features improved risk stratification and 12-month early-versus-late recurrence classification (AUC 0.704 vs 0.582 for clinical features alone). Conclusions: Integrating perilesional oedema and white-matter disconnectome MR features with clinical and molecular data enhances prediction of PFS in GBM, IDH-wt. These network-aware, multimodal survival models may support personalised risk-adapted treatment strategies pending external validation.
Key Points:
- GBM IDH-wt exhibits a high recurrence rate despite aggressive treatment.
- Addition of high-dimensional oedema and disconnectome radiomic features to clinical features showed consistent improvement in the test performance of 3 ML models.
- This can support informed clinical decision-making.
Importance of the Study: Prediction of progression free survival (PFS) for a patient with highly recurrent glioblastoma IDH-wt traditionally relies on clinical history, demographics, and molecular markers of the tumour. Recent literature reveals the tumour's disruptive nature through its invasion of white-matter tracts and identifies its microenvironment, particularly the perilesional oedema, as a harbour of treatment resistant tumour cells. This study is the first to combine high-dimensional radiomic features of the tumour, the oedema, and their disconnectome with clinical and treatment factors to predict PFS. Using 3 model architectures (XGBoost, RSF, and CoxPH), we demonstrate consistent directional improvements in performance on addition of radiomic features to clinical baseline models. Furthermore, oedema and disconnectome radiomics are identified as top predictor features across algorithms. This proof-of-concept study provides a reproducible multimodal pipeline, reaffirms the usability of MR radiomics, and identifies features of the oedema and the structural connectome as promising biomarkers, demanding large-scale external validation.
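The survival models in the abstract above are compared by Harrell's C-index. A minimal Python sketch of that statistic follows; it is a simplified illustration (pairs with tied times are skipped), with hypothetical function and variable names, and is not the authors' evaluation code.

```python
def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable pairs in which the higher-risk subject fails first.

    times: follow-up times; events: 1 if progression observed, 0 if censored;
    risk_scores: model output where a larger value means higher predicted risk.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i has an observed event
            # strictly before subject j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical follow-up data (months), invented for illustration only.
print(harrell_c_index([6, 12, 18, 24], [1, 1, 0, 1], [0.9, 0.4, 0.5, 0.1]))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which gives a scale for reading the reported 0.665 versus 0.595.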
Kästingschäfer, K. F.; Fink, A.; Rau, S.; Reisert, M.; Kellner, E.; Nolde, J. M.; Kottgen, A.; Sekula, P.; Bamberg, F.; Russe, M. F.
Rationale and Objectives: Contrast-enhanced (CE) MRI provides clear corticomedullary contrast for renal compartment delineation but may be contraindicated or undesirable in routine practice. We aimed to enable automated extraction of renal imaging biomarkers from routine non-contrast-enhanced (NCE) T1-weighted MRI by transferring CE-derived compartment labels. Materials and Methods: This retrospective single-center study (January 2017 to December 2021) included 200 participants with paired arterial-phase CE and NCE T1-weighted MRI. Cortex, medulla, and sinus were manually segmented on CE MRI and rigidly transferred to NCE MRI to provide voxel-level reference labels. A hierarchical 3D Deep Neural Patchworks model was trained on 100 examinations (90 training/10 validation) and evaluated on an independent test set of 100 examinations using the transferred CE masks on NCE as reference. Performance was assessed using Dice similarity of segmentations and biomarker agreement using volumes and surface areas (Pearson/Spearman, MAE, Lin's CCC, and Bland-Altman). Results: Whole-kidney segmentation Dice was 0.950 (left) and 0.953 (right). Total kidney volume showed high agreement with minimal bias (MAE 8.76 mL, 2.5% of mean; CCC 0.983; bias -1.56 mL; 95% limits of agreement -28.81 to 25.69 mL). Cortex volume was modestly overestimated and medulla volume underestimated, shifting predicted compartment fractions toward cortex (74.7% vs. 72.1% in ground truth; medulla 21.5% vs. 24.3%; sinus 3.8% vs. 3.6%). Sinus volume maintained high concordance despite higher Dice dispersion. Surface area was systematically underestimated with low concordance. Conclusion: CE-supervised knowledge transfer enables accurate, well-calibrated kidney volumetry from routine NCE MRI and supports contrast-free renal biomarker extraction. Surface area estimation remains challenging.
Take-home Messages:
- CE-supervised label transfer enables accurate, well-calibrated contrast-free kidney volumetry on routine non-contrast T1-weighted MRI.
- Compartment volumetry is feasible but shows systematic cortex overestimation and medulla underestimation; surface area remains non-interchangeable due to boundary uncertainty.
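Two of the agreement measures named in the abstract above, the Dice similarity coefficient and Lin's concordance correlation coefficient (CCC), can be sketched in a few lines of Python. The representation of masks as sets of voxel indices, the function names, and the sample volumes are assumptions made for illustration.

```python
from statistics import fmean


def dice(mask_a, mask_b):
    """Dice similarity between two masks given as sets of voxel indices."""
    a, b = set(mask_a), set(mask_b)
    if not a and not b:
        return 1.0  # two empty masks agree trivially
    return 2 * len(a & b) / (len(a) + len(b))


def lins_ccc(x, y):
    """Lin's CCC: penalises both scatter and systematic bias between x and y."""
    mx, my = fmean(x), fmean(y)
    n = len(x)
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2 * sxy / (sxx + syy + (mx - my) ** 2)


# Hypothetical volumes (mL), invented for illustration only.
reference = [310.0, 355.0, 402.0, 298.0]
predicted = [305.0, 360.0, 395.0, 301.0]
print(lins_ccc(reference, predicted))
print(dice({1, 2, 3, 4}, {2, 3, 4, 5}))
```

Unlike Pearson correlation, the CCC drops when predictions are shifted by a constant offset, which is why it suits a claim of well-calibrated volumetry.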
Readford, T. R.; Martinez, G. J.; Patel, S.; Kench, P. L.; Andia, M. E.; Ugander, M.; Giannotti, N.
Background: Dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) enables non-invasive characterization of carotid atherosclerotic plaque. Purpose: To evaluate the performance and reproducibility of a simplified DCE-MRI quantification method for carotid plaque assessment. Methods: T1-weighted black-blood DCE-MRI of the carotid arteries at 3T was performed at baseline and after six months in patients with mild-to-moderate atherosclerotic lesions in a pilot placebo-controlled randomized trial evaluating the effects of low-dose (0.5mg daily) colchicine therapy on carotid plaque volume. DCE-MRI signal intensity was measured in manually drawn regions of interest in the plaque core, remote non-atherosclerotic vessel wall, and skeletal muscle. Peak signal intensities were normalized to skeletal muscle signal in the same slice. Results: In patients (n=28, median [interquartile range] age 72 [64-74] years, 36% female, n=13/15 colchicine/placebo), normalized peak signal intensity was higher in the plaque core than in the remote vessel wall at both baseline (3.5 [2.3-4.1] vs 2.1 [1.7-2.5], p<0.001) and follow-up (3.2 [2.5-4.4] vs 2.0 [1.7-2.5], p<0.001). Measurements did not differ between baseline and follow-up for all patients (0.7±0.7 for plaque core, 0.6±0.4 for remote vessel wall, p>0.80 for both) nor between colchicine intervention and placebo control (p>0.35 for either region). Conclusions: Normalised peak signal intensity on DCE-MRI was consistently higher in the carotid plaque core than in the remote vessel wall, showed excellent reproducibility in both regions over six months, and was not altered by colchicine treatment. This simplified, muscle-normalised approach may facilitate future studies exploring DCE-MRI measures potentially related to plaque vulnerability.
Reinosa, R.
Introduction: The precise determination of diagnostic cut-off points is essential for the development of multimarker panels in oncology. In previous work on pulmonary nodules, it was observed that the standard two-parameter logistic fit could be insufficient for biomarkers with asymmetric distributions. Furthermore, the calculation of empirical cut-off points based on graphical visualization presented limitations in precision and reproducibility. Objective: This study presents a methodological advancement in the data analysis phase (Stage 1), introducing new Python algorithms for the direct analytical calculation of empirical intersections and robust mathematical modeling using Dual Annealing with both two-parameter and four-parameter logistic functions. This improved methodology feeds into the ThresholdXpert 1.0 software tool for combinatorial optimization of biomarker panels (Stage 2), and is applied here to the diagnostic challenge of hepatocellular carcinoma (HCC). Methods: The methodology was first validated by re-analyzing a dataset of patients with pulmonary nodules (N=895). It was subsequently applied to an HCC dataset derived from the cohort of Jang et al. (208 HCC, 193 cirrhosis, 401 total), randomly divided into a training set (280) and an independent test set (121). Scripts were developed to compare the previous two-parameter logistic fit with the new two- and four-parameter logistic models. Finally, ThresholdXpert 1.0 was used for multimarker panel optimization. Results: The integration of empirical calculation, logistic modeling, and combinatorial optimization through ThresholdXpert 1.0 provides a robust and coherent framework for the development of multimarker diagnostic panels. The four-parameter logistic model provided additional validation without substantially modifying cut-off values for most biomarkers, confirming the stability of the approach while offering greater flexibility for complex distributions.
When applied to hepatocellular carcinoma, the framework identified a molecular panel composed of AFP, PIVKA-II, OPN, and DKK-1 with sensitivity of 0.77 and specificity of 0.72, and an optimized panel incorporating inverse MELD that achieved the best overall balance (sensitivity 0.73, specificity 0.75) in independent external validation. These results demonstrate the potential of this approach as a generalizable tool for the optimized design of binary diagnostic systems in oncology. Conclusion: The integration of complementary mathematical modeling enhances the capability of ThresholdXpert 1.0 to identify robust diagnostic panels, as in some cases a single biomarker may outperform biomarker combinations, and vice versa. This approach enabled the integration of molecular biomarkers and clinical variables under a unified mathematical framework. Contact: roberto117343@gmail.com
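To make the two- versus four-parameter logistic comparison concrete, here is a minimal Python sketch of one common parameterization. The abstract does not give the author's exact functional form, so the parameter names (`lower`, `upper`, `slope`, `midpoint`) and the helper `cutoff_at` are illustrative assumptions. Fitting the parameters (for example with `scipy.optimize.dual_annealing`) is deliberately left out.

```python
import math


def logistic4(x, lower, upper, slope, midpoint):
    """Four-parameter logistic: asymptotes `lower` and `upper`, plus slope and midpoint."""
    return lower + (upper - lower) / (1.0 + math.exp(-slope * (x - midpoint)))


def logistic2(x, slope, midpoint):
    """Two-parameter logistic: the special case with asymptotes fixed at 0 and 1."""
    return logistic4(x, 0.0, 1.0, slope, midpoint)


def cutoff_at(level, lower, upper, slope, midpoint):
    """Biomarker value x at which the 4PL curve equals `level` (closed-form inverse)."""
    if not lower < level < upper:
        raise ValueError("level must lie strictly between the asymptotes")
    if slope == 0:
        raise ValueError("slope must be non-zero")
    ratio = (upper - lower) / (level - lower) - 1.0
    return midpoint - math.log(ratio) / slope


# Hypothetical parameters: with free asymptotes the 0.5 crossing moves off the midpoint.
print(cutoff_at(0.5, 0.0, 1.0, 1.2, 5.0))  # 2PL case: crossing sits at the midpoint
print(cutoff_at(0.5, 0.0, 0.8, 1.2, 5.0))  # 4PL case with an upper asymptote of 0.8
```

The free asymptotes are what allow a four-parameter curve to follow a biomarker whose response saturates below 1 or starts above 0, which a two-parameter curve cannot represent.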
Yang, E.; Agrawal, S.; Kinslow, C. J.; Cheng, S. K.; Yang, L.; Wang, E.; Wang, T. J.; Kachnic, L. A.; Brenner, D. J.; Shuryak, I.
Lower-grade gliomas (World Health Organization [WHO] grades 2-3) exhibit variable treatment responses, yet clinical decisions remain guided by population-level trial results. Standard causal survival forests estimate treatment effects at individual time horizons but lack methodology to synthesize these into interpretable temporal trajectories. Here, we apply the Causal Analysis of Survival Trajectories (CAST) framework, a recently developed extension of causal survival forests that synthesizes horizon-specific causal effect estimates into smooth temporal curves while accounting for between-horizon covariances via bootstrap estimation and Ledoit-Wolf shrinkage. We apply CAST to estimate time-varying, heterogeneous effects of radiotherapy and chemotherapy in 776 patients with lower-grade gliomas from The Cancer Genome Atlas (TCGA; n=512) and the Chinese Glioma Genome Atlas (CGGA; n=264), analyzing six treatment-outcome scenarios and adjusting for age, sex, WHO grade, isocitrate dehydrogenase (IDH) mutation status, 1p/19q codeletion, and extent of resection using elastic net propensity scores with overlap weighting. CAST curves reveal that chemotherapy provides consistent, sustained benefits across both cohorts; survival probability gains peak at 0.31 at 72-84 months for TCGA overall survival and 0.46 at 48 months for progression-free survival, with restricted mean survival time gains of 18.4 and 32.5 months at 10 years, respectively. CGGA chemotherapy shows delayed but large positive effects (survival probability peak 0.48 at 108 months). Radiotherapy effects are mixed, with modest E-values indicating sensitivity to residual confounding by indication. Subgroup CAST curves identify age at diagnosis as the dominant driver of treatment effect heterogeneity (46-56% of splits). All findings are robust to placebo permutation, simulated unobserved confounder, and negative control refutation tests. 
The CAST framework provides a general-purpose tool for temporal treatment effect visualization applicable beyond neuro-oncology.
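The restricted mean survival time (RMST) gains quoted in this abstract are differences in the area under two survival curves up to a fixed horizon. The Python sketch below shows only that definition on a step-shaped survival curve. It is not the CAST estimator, and the function name `rmst` and the survival values are hypothetical.

```python
def rmst(times, surv, horizon):
    """Area under a right-continuous step survival curve from 0 to `horizon`.

    times: increasing time points where the curve drops
    surv:  survival probability from each of those times onward
    """
    if horizon <= 0:
        raise ValueError("horizon must be positive")
    if len(times) != len(surv):
        raise ValueError("times and surv must have equal length")
    area = 0.0
    prev_t, prev_s = 0.0, 1.0  # survival is 1 at time zero
    for t, s in zip(times, surv):
        if t >= horizon:
            break
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    area += prev_s * (horizon - prev_t)  # final piece up to the horizon
    return area


# Hypothetical curves (months), evaluated at a 10-year horizon.
treated = rmst([24, 60, 96], [0.9, 0.7, 0.5], horizon=120)
control = rmst([24, 60, 96], [0.8, 0.5, 0.3], horizon=120)
print(treated - control)  # RMST gain in months
```

An RMST difference reads directly as months of survival gained within the horizon, which is why it complements the survival-probability differences reported at single time points.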
Hoe, Z. Y.; Ding, R.-S.; Chou, C.-P.; Hu, C.; Lee, C.-H.; Tzeng, Y.-D.; Pan, C.-T.; Lee, M.-C.; Lee, E. K.-L.
Background: Breast cancer-related lymphedema (BCRL) is a common complication following breast cancer treatment. While lymphoscintigraphy is considered the diagnostic gold standard, it is unsuitable for routine periodic monitoring or assessment of treatment efficacy. Shear wave elastography (SWE) offers a possible alternative, but traditional modes of operation limit its potential. Proposed Solutions: The Holder-Optimized Elastography (HOE) method is introduced to eliminate pressure issues introduced by manual operation of ultrasound probes by stabilizing them above the cutis. Methods: The HOE method was used to acquire ARFI images of high-velocity areas (HVAs, with shear wave velocity greater than 7 m/s) in limbs with and without BCRL (as confirmed and characterized by lymphoscintigraphy) in two cohorts of 15 and 125 patients. Results: The HOE method enabled ARFI elastography to directly and consistently visualize the effects caused by both obstructed lymphatic vessels and intraluminal lymphatic fluid as HVAs, whereas traditional hand-held methods did not. Inter-limb differences in HVA burden showed moderate diagnostic performance for detecting BCRL and grading obstruction with modest sensitivity. However, there was systematic underestimation of both early and confluent advanced lesions. Conclusion: HOE-based HVA imaging has potential for rapid and non-invasive monitoring of lymphedema course and treatment response and may serve as a useful adjunct to existing diagnostic tools for BCRL. However, further technical refinements and quantitative analytic methods will be required to fully exploit the richer SWV information provided by HOE and to enhance the diagnostic utility of HVAs. Summary Statement: The Holder-Optimized Elastography method ("HOE" method) increases the diagnostic capability of ARFI elastography for breast cancer-related lymphedema, allowing for the non-invasive detection of some lymphatic obstructions but not all.
Key Results: The Holder-Optimized Elastography (HOE) method revealed the effects caused by fluid-filled lymphatic vessels as "High-Velocity Areas" (HVAs), which are difficult to detect by conventional methods. HVA counts for detecting lymphedema (any obstruction vs. no obstruction) showed high specificity (0.86-1.00) but low sensitivity (0.57-0.67). Conversely, HVA counts for staging lymphedema (i.e. total vs. partial obstruction) showed high sensitivity (up to 1.00) but low specificity (0.48-0.66). The inter-limb difference of HVAs counted in whole-limb scans between affected and unaffected limbs (aka, the "Global Mean Difference") provided the most balanced diagnostic performance (sensitivity 0.67-0.79, specificity 0.88-0.89).
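The sensitivity and specificity figures in this abstract follow from thresholding a per-patient measurement against the lymphoscintigraphy reference. The Python sketch below shows that calculation for an inter-limb HVA difference. The threshold, the function name `sens_spec`, and the patient data are hypothetical and are not taken from the study.

```python
def sens_spec(differences, has_bcrl, threshold):
    """Sensitivity and specificity when a patient is called positive
    if the inter-limb HVA difference is at least `threshold`."""
    tp = fn = tn = fp = 0
    for diff, truth in zip(differences, has_bcrl):
        predicted = diff >= threshold
        if truth and predicted:
            tp += 1
        elif truth:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative reference case")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical inter-limb HVA differences and reference-standard labels.
diffs = [6, 4, 1, 3, 0, 1, 0, 2]
truth = [True, True, True, False, False, False, False, False]

sensitivity, specificity = sens_spec(diffs, truth, threshold=3)
print(sensitivity, specificity)
```

Raising the threshold trades sensitivity for specificity, which matches the pattern the abstract reports across its detection and staging tasks.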
Haueise, T.; Machann, J.
Chemical shift-encoded magnetic resonance imaging using high-resolution 3D Dixon techniques enables the non-invasive and radiation-free assessment of whole-body adipose tissue and ectopic fat distribution. Automatic deep learning-based segmentation of metabolically relevant adipose tissue compartments and ectopic fat deposits in parenchymal tissue is the most important image processing step for the quantification of adipose tissue volumes and ectopic fat percentages from whole-body imaging. This work presents a segmentation model dedicated to the segmentation of 19 metabolically relevant adipose tissue compartments and ectopic fat deposits from whole-body Dixon MRI. The trained segmentation model is available upon request. Related post-processing routines to compute volumes and fat percentages are publicly available: https://github.com/tobihaui/WholeBodyATQuantification.
Salome, P.; Knoll, M.; Walz, D.; Cogno, N.; Dedeoglu, A. S.; Qi, A. L.; Isakoff, S. J.; Abdollahi, A.; Jimenez, R. B.; Bitterman, D. S.; Paganetti, H.; Chamseddine, I.
Introduction: Manual data extraction from unstructured clinical notes is labor-intensive and impractical for large-scale clinical and research operations. Existing automated approaches typically require large language models, dedicated computational infrastructure, and/or task-specific fine-tuning that depends on curated data. The objective of this study is to enable accurate extraction with smaller locally deployed models using a disease-site specific pipeline and prompt configuration that are optimized and reusable. Materials/Methods: We developed OncoRAG, a four-phase pipeline that (1) generates feature-specific search terms via ontology enrichment, (2) constructs a clinical knowledge graph from notes using biomedical named entity recognition, (3) retrieves relevant context using graph-diffusion reranking, and (4) extracts features via structured prompts. We ran OncoRAG using Microsoft Phi-3-medium-instruct (14B parameters), a midsize language model deployed locally via Ollama. The pipeline was applied to three cohorts: triple-negative breast cancer (TNBC; n_patients=104, n_features=42; primary development), recurrent high-grade glioma (RiCi; n_patients=191, n_features=19; cross-lingual validation in German), and MIMIC-IV (n_patients=100, n_features=10; external testing). Downstream task utility was assessed by comparing survival models for 3-year progression-free survival built from automatically extracted versus manually curated features. Results: The pipeline achieved mean F1 scores of 0.80 +/- 0.07 (TNBC; n_patients=44, n_features=42), 0.79 +/- 0.12 (RiCi; n_patients=61, n_features=19), and 0.84 +/- 0.06 (MIMIC-IV; n_patients=100, n_features=10) on test sets under the automatic configuration. Compared to direct LLM prompting and naive RAG baselines, OncoRAG improved the mean F1-score by 0.19 to 0.22 and 0.17 to 0.19, respectively. Manual configuration refinement further improved the F1-score to 0.83 (TNBC) and 0.81 (RiCi), with no change in MIMIC-IV.
Extraction time averaged 1.7-1.9 seconds per feature with the 14B model. Substituting a smaller 3.8B model reduced extraction time by 57%, with a decrease in F1-score (0.03-0.10). For TNBC, the extraction time was reduced from approximately two weeks of manual abstraction to under 2.5 hours. In an exploratory survival analysis, models using automatically extracted features showed a comparable C-index to those with manual curation (0.77 vs 0.76; 12 events). Conclusions: OncoRAG, deployed locally using a mid-size language model, achieved accurate feature extraction from multilingual oncology notes without fine-tuning. It was validated against manual extraction for both retrieval accuracy and survival model development. This locally deployable approach, which requires no external data sharing, addresses a critical bottleneck in scalable oncology research.
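The "mean F1 +/- spread" figures in this abstract can be read as per-feature F1 scores summarized across features. The Python sketch below shows that calculation under that reading. The abstract does not state how the authors aggregated, so the per-feature averaging, the feature names, and the counts are all assumptions for illustration.

```python
from statistics import mean, stdev


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    denom = 2 * tp + fp + fn
    return 0.0 if denom == 0 else 2 * tp / denom


# Hypothetical per-feature counts: (true positives, false positives, false negatives).
counts = {
    "tumor_size": (40, 5, 5),
    "nodal_status": (30, 10, 10),
    "receptor_status": (18, 2, 6),
}

per_feature = {name: f1_score(*c) for name, c in counts.items()}
scores = list(per_feature.values())
print(per_feature)
print(f"mean F1 = {mean(scores):.2f} +/- {stdev(scores):.2f}")
```

Averaging per feature rather than pooling counts gives rare features the same weight as common ones, so a few hard-to-extract fields can pull the mean down noticeably.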
Jahani, F.; Jiang, Z.; Nabaei, M.; Baek, S.
Computational growth and remodeling (G&R) models have been extensively used to investigate abdominal aortic aneurysm (AAA) progression and to support clinical decision-making. However, the development of robust predictive models is often limited by the scarcity of large-scale longitudinal imaging datasets. In this study, we propose a physics-based G&R framework to simulate AAA shape evolution and generate a virtual cohort of aneurysms, thereby addressing data limitations and enabling integration with data-driven machine learning approaches for growth prediction. The proposed arterial G&R model incorporates key mechanisms influencing aneurysm progression, including elastin degradation and stress-mediated collagen production. A modified elastin degradation formulation was introduced to generate realistic aneurysm geometries exhibiting clinically relevant features such as asymmetry and tortuosity. By systematically varying parameters governing elastin damage and collagen production, 200 distinct G&R simulations were performed to produce a diverse set of AAA geometries. The dataset was further expanded using kriging-based spatial interpolation to construct a large in silico cohort. The synthetic dataset, combined with longitudinal imaging data from 25 patients, was used to train and validate four machine learning models: Deep Belief Network (DBN), Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU). A two-step training strategy was adopted to predict maximum aneurysm diameter and growth rate based on prior geometric characteristics. The LSTM model achieved the highest performance for maximum diameter prediction (R² = 0.92), while the RNN demonstrated strong overall performance (R² = 0.90 for maximum diameter and 0.89 for growth rate). The DBN and GRU models also showed competitive predictive capability.
Overall, this study demonstrates that integrating physics-based G&R simulations with machine learning enables accurate prediction of AAA growth and maximum diameter. The proposed framework provides a scalable strategy for augmenting limited clinical datasets and offers a promising tool to support personalized risk assessment and treatment planning.
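The R² values reported for the diameter and growth-rate predictions are coefficients of determination. The Python sketch below computes one from paired observed and predicted values. The function name `r_squared` and the diameters are hypothetical, and the abstract does not say whether the authors used this exact definition.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1.0 - ss_res / ss_tot


# Hypothetical maximum aneurysm diameters (mm): measured versus model output.
measured = [30.0, 40.0, 50.0, 60.0]
modelled = [32.0, 38.0, 51.0, 59.0]
print(r_squared(measured, modelled))
```

Because R² is scaled by the variance of the observed values, a cohort with a wide range of diameters can yield a high R² even when absolute errors of a few millimetres remain.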
Alhuzaimi, A.; Alkanhal, A.; Alruwaili, A. R. S.; Alharbi, N. S.; Alfares, F.; Aldekhyyel, R. N.; Binkheder, S.; Temsah, A.; Aljamaan, F.; Shahzad, M.; Albriek, A. Z.; Alanazi, F. I.; Alhindi, D. A.; Al-khatib, S. M.; Darweesh, A. A.; Altamimi, I.; Jamal, A.; Saad, K.; Alhasan, K.; Al-Eyadhy, A.; Malki, K. H.; Temsah, M.-H.
Background: Generative artificial intelligence (AI) systems are increasingly used to produce medical illustrations for education; however, their anatomical accuracy in complex domains such as congenital heart disease (CHD) remains insufficiently validated. Methods: In an assessor-blinded comparative study, we evaluated AI-generated CHD illustrations from two contemporary text-to-image platforms (ChatGPT-5/ChatGPT-Images and Gemini NanoBanana) against human-modified educational images. Twenty different CHD types were included, yielding 147 images that were assessed by 20 physicians (10 CHD experts and 10 non-specialists). Images were rated across four domains: anatomical accuracy, label usefulness, visual attractiveness, and suitability for medical education (total score range, 4-12). Results: Among 2,940 total image evaluations, the human-modified images demonstrated the highest anatomical accuracy (48.3% rated accurate), followed by NanoBanana (22.7%), while ChatGPT-generated images were predominantly rated as fabricated or incorrect (86.3% for ChatGPT-5 and 85.2% for ChatGPT-Images; p<0.001). Educational usability "as is" was highest for the human-modified images (37.9%) compared with NanoBanana (13.1%) and ChatGPT platforms (≤2.1%; p<0.001). Median overall quality scores were 8 for the human-modified CHD images and NanoBanana, versus 4 for both ChatGPT systems (p<0.001). In multivariable analysis, NanoBanana images were the closest to the human-modified images in quality (95% CI, 0.91-0.98), while ChatGPT-Images (95% CI, 0.58-0.63) and ChatGPT-5 (95% CI, 0.55-0.59) showed marked quality reductions. Conclusions: The current generative AI systems produced visually compelling but frequently anatomically inaccurate CHD illustrations, falling substantially short of the current educational standards. Model choice strongly influences performance, with Gemini NanoBanana outperforming ChatGPT-based systems yet remaining inferior to expert-designed human-modified images.
AI-generated cardiac imagery should be used only within expert-reviewed educational workflows rather than as independent instructional resources.